Even in the best of cases, where much is known about a
future solution’s peak transaction workload or a typical end user’s work
habits, SAP solution sizing remains an iterative process, as much
art as science. Understanding SAP’s architecture therefore pays big
dividends when it comes to the sizing process. Of course, to gain more
than a cursory understanding of the internal architecture employed by
mySAP components requires considerable training, coupled with a number
of years of experience. But a basic understanding of the core concepts
behind the operation of an SAP system will help smooth out some of the
bumps during the sizing process. And this knowledge should help you
maintain an apples-to-apples sizing comparison between different
hardware and other SAP technology partners.
Each mySAP component can be architected to take advantage of something called a three-tiered client/server architecture.
Many years ago, SAP realized the advantages of separating the
application’s logic from the database. The three technology layers that
came of this—database, application, and front-end client—came to be
known as a three-tiered architecture. By breaking the layers apart this
way, each could be scaled,
or grown, independently, which at the time was a very different approach
from the monolithic mainframe solutions of the day, where growth meant
tossing out your current mainframe and lugging in a bigger one.
SAP also architected
the three layers such that they could reside on a single physical
machine, or could be combined in different ways. The result was a very
flexible and—based on the number of SAP deployments—a very successful
architecture. Today, every layer, even the database server, which
handles all database transactions, can be scaled through products like
Oracle’s 9i Real Application Clusters.
If we think of the database as the first layer in the three-tiered client/server architecture, the application component, called the Application Server, is the second layer. It is very common to see anywhere from 2 or 3 to perhaps 10 or 12 application servers in
a single system. In this way, the system’s processing power can be
increased easily as utilization climbs or as the system must host a
greater number of users.
The runtime element of mySAP components is referred to as the kernel, which spawns a number of SAP work processes,
each serving different functions—work processes are created
specifically to support online users, background or batch processes,
printing, database updates, and so on.
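As a rough illustration of the work-process mix just described, the sketch below tallies the processes configured for a single application server instance. The counts are invented for illustration; the dictionary keys are the SAP instance-profile parameters that control each work-process type (dialog, batch, update, spool, and enqueue).

```python
# Illustrative work-process mix for one application server instance,
# keyed by the instance-profile parameters that set each type's count.
# The counts themselves are invented, not a recommendation.
work_processes = {
    "rdisp/wp_no_dia": 10,  # dialog: online user requests
    "rdisp/wp_no_btc": 3,   # background/batch jobs
    "rdisp/wp_no_upd": 2,   # database updates
    "rdisp/wp_no_spo": 1,   # spool (printing)
    "rdisp/wp_no_enq": 1,   # enqueue (lock management, on the CI)
}

total = sum(work_processes.values())
dialog_share = work_processes["rdisp/wp_no_dia"] / total
print(f"{total} work processes configured, {dialog_share:.0%} dialog")
```

A heavily online-user-oriented system skews this mix toward dialog processes, while a batch-heavy system shifts it toward background processes; the sizing exercise should reflect which profile you expect.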
Another unique element of the application layer is the Central Instance,
which takes care of handling database locks, interapplication server
messaging, and other core housekeeping activities; without the Central
Instance, there is no SAP installation. Oftentimes, the Central Instance
(CI) actually runs with one of the Application Servers dedicated to
servicing end-user transactions. For the most robust configurations,
though, SAP allows the CI to be relocated to its own physical server, so
that the processor-intensive application servers can be granted the
resources of an entire server instead of being forced to share CPU and
memory with the CI. In other words, a separate CI server helps ensure
the system can respond instantly to the next request without waiting for
resources (primarily CPU) to be freed from some non-CI-related use.
The third layer in the
three-tiered client/server architecture is the presentation layer, which
simply means front-end clients. It is with these front-end clients
(desktops, laptops, wireless handheld devices, and so on), that users
connect to the SAP system using a Web browser or SAP’s own user
interface, the SAPGUI. In many cases nowadays, an additional tier, the
Internet or a company’s intranet, exists, too. This tier actually
resides between the application and client tiers, and in effect extends
both the application logic and the network of mySAP solutions. Thus, a
four-tier solution is born in these cases.
Although my focus thus far
has been on three-tier environments, remember that SAP’s architecture is
flexible and can easily be adapted to support two tiers as well. Simply
put, if the database server and application server execute on the same
physical server, you have a two-tier system. In this kind of
environment, end users connect directly to the central instance, whereas
in a three-tier environment, end users connect to a specific
application server or pre-established group of servers called a logon group
(though this connection is still intelligently handled by the CI). And
whether you have architected a two-, three-, or four-tiered SAP system,
all communication between the different tiers takes place over TCP/IP
(with some exceptions in two-tier systems that leverage
process-to-process communications, which are outside the scope of this
introduction to SAP architecture).
Now, armed
with a better understanding of what SAP architecture entails, let’s move
into the next section where different sizing methodologies are put into
practice to architect specific mySAP solutions.
Understanding Different Sizing Methodologies
In
addition to a full-fledged top-to-bottom solution stack approach to SAP
sizing, a number of other sizing methodologies and approaches are often
undertaken by different SAP technology partners. The key to any valid
sizing approach is to understand the workload being performed, so that a
hardware configuration with the proper number and speed of CPUs, RAM,
and disk drives can be assembled. Some sizing approaches are faster than
others, though, at the expense of sizing precision. For example, many
hardware vendors provide Budgetary Sizings
based solely on the expected number of active users to be supported by a
particular mySAP solution. In this way, a ballpark dollar figure can be
gleaned early in an SAP project without requiring all the time and
trouble of answering a comprehensive sizing questionnaire.
As I mentioned
earlier, SAP AG provides its own rendition of a budgetary sizing by
means of its Quick Sizer, an online tool most often leveraged for its
ability to perform rapid user-based mySAP sizings. Available at http://service.sap.com/quicksizing,
the Quick Sizer can also model mySAP solutions even more accurately
through an analysis of transactions and resulting outputs, in the form
of customer-provided quantity and structure-related data. For example,
business requirements that can be described in terms of the number of
expected financial documents, receipts, postings, average line items in a
typical order, and so on to be processed or created annually will more
accurately help an SAP sizing expert craft a hardware solution than will
an online user-based sizing approach.
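To make the user-based approach concrete: SAP expresses hardware capacity in SAPS, a hardware-independent throughput unit defined by the SD standard application benchmark (100 SAPS corresponds to 2,000 fully business-processed order line items per hour). The sketch below converts an assumed user population into a rough SAPS target; the per-user weights and user counts are illustrative assumptions, not official Quick Sizer figures.

```python
# Rough user-based sizing sketch. The SAPS-per-user weights and the
# user counts are assumptions for illustration only.
saps_per_user = {"low": 0.3, "medium": 1.2, "high": 4.0}  # assumed weights
users = {"low": 300, "medium": 150, "high": 50}           # assumed counts

required_saps = sum(saps_per_user[c] * users[c] for c in users)
# Size above the steady-state load so the servers are not driven to
# full utilization (65% average is a common target).
sized_saps = required_saps / 0.65
print(f"steady-state load ~{required_saps:.0f} SAPS, "
      f"size for ~{sized_saps:.0f} SAPS at a 65% utilization target")
```

The point of the sketch is the shape of the calculation, not the numbers: a budgetary sizing is only as good as the activity assumptions behind each user class.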
This brings us to Transaction-Based Sizing.
As the name implies, this approach seeks to characterize and understand
the nature of end-to-end functional transactions being executed as part
of a particular mySAP component. In addition to the quantities and
structures already mentioned, peak processing hours and peak throughput
loads are also factored in, just as they should be in user-based sizing
exercises. Great care needs to be taken to avoid underestimating the
number of transactions to be performed by a particular system,
though—it’s easy to shortchange the sizing exercise. In my own
experience, I therefore try to do the following:
- Understand
what the new SAP system replaces, which can help me understand
potentially how many users in various functional areas might be using
SAP in the future. Again, though, great care must be taken not to
confuse the limited capabilities of a legacy system with a new SAP
solution. The SAP solution will generate many more transactions per
user, due to its greater capabilities and ties back into other
functional areas.
- Define the peak transaction processing
requirements, not just what the system will typically be doing
day-to-day. In other words, it’s important to discover what a particular
customer’s month-end or quarter-end close looks like from a transaction
load perspective, and whether any seasonal peaks exceed even this load.
Don’t forget to include both online and batch transactions.
- Explicitly
state assumptions. If a customer does not understand his batch job
requirements, or is unclear as to reporting requirements, I will take a
cut at this based on my own experience, and document my assumptions. In
this way, if the customer later learns what his exact requirements are,
it is a simple matter to refine the sizing document (and therefore avoid
accidentally doubling or tripling a load that had been previously
extrapolated but not clearly identified).
- Determine
transaction types and weighting. Not all transactions are equally
“heavy” in the eyes of SAP. Financial transactions, for example, may
only consist of four dialog steps whereas SD transactions are five to
six times heavier. Thus, not only do I determine the types of
transactions that will occur on a system, but I also seek to convert or
normalize all transactions into a similar genre (like SD, if I’m working
with R/3), by weighting many light transactions into heavier SD
transactions, and so on.
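The normalization step in the last bullet can be sketched as follows. The relative weights are invented for illustration (with SD fixed at 1.0), though the FI weight of 0.2 mirrors the five-to-one ratio mentioned above; the transaction volumes are hypothetical.

```python
# Convert a mixed transaction forecast into SD-equivalent units so the
# whole workload can be expressed in one "genre," as described above.
# Weights are illustrative assumptions; SD is 1.0 by definition here.
sd_weight = {"SD": 1.0, "FI": 0.2, "MM": 0.5}            # assumed relative cost
peak_hour_transactions = {"SD": 1000, "FI": 4000, "MM": 600}  # assumed volumes

sd_equivalents = sum(sd_weight[t] * n
                     for t, n in peak_hour_transactions.items())
print(f"peak load ~{sd_equivalents:.0f} SD-equivalent transactions/hour")
```

Once everything is stated in SD equivalents, the peak-hour figure can be compared directly against benchmark-derived capacity numbers for candidate hardware.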
Other sizing approaches are quite common, too. A Delta Sizing
approach, for instance, is quite useful for customers already live on a
particular SAP product. The customer’s SAP Basis or adept SAP
Operations team can easily be directed through SAP’s Computing Center
Management System (CCMS) to identify the load observed in real time and
historically, in terms of the number of dialog steps processed, so that
planned changes to the system (like adding an incremental number of
users or transactions) can be intelligently extrapolated.
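A delta sizing ultimately comes down to scaling an observed load by the planned change. The sketch below extrapolates a measured peak dialog-step rate to a larger user population; all figures are hypothetical.

```python
# Delta-sizing sketch: extrapolate from the observed peak load to a
# planned future load. All numbers are hypothetical.
current_dialog_steps_per_hour = 120_000  # observed peak from monitoring
current_users = 400
planned_users = 520                      # adding 120 users

scale = planned_users / current_users
projected_steps = current_dialog_steps_per_hour * scale
print(f"projected peak ~{projected_steps:,.0f} dialog steps/hour "
      f"({scale:.0%} of today's load)")
```

Linear extrapolation is the simplest assumption; if the incremental users belong to a heavier functional area than today’s mix, the scale factor should be weighted accordingly.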
The final and
most demanding sizing process that I am aware of is called something
akin to “customer-specific sizing benchmarks” or “customer performance
testing” or “proof-of-concept tests.” Regardless of the label, these
sizing exercises take much of the assumption-making and guesswork out of
sizing, replacing them with hard facts as to the load that a particular
SAP Solution Stack is capable of bearing.
Although a Proof-of-Concept, or POC,
can be time-consuming (not to mention expensive), the resulting peace
of mind is compelling. POCs share much in common with stress tests and
load tests, which are executed prior to Go-Live to ensure that a
production system is indeed capable of meeting performance metrics and
other service-level agreements made between the IT department and an
enterprise solution’s customers, its end users. For example:
- A POC and a stress test are both focused on testing the performance and scalability of an SAP product.
- In both cases, testing usually begins with single-user tests, then scales to a larger and larger number of front-end clients until the load represents what the customer expects in the real world.
- To perform a POC or stress test, either the actual pre-production system or a system configured identically to it must be installed, configured, and tuned.
- Real-world data, and plenty of it, is required. In the early stages of sizing, this is often the biggest factor in pulling off a successful POC, as good data is hard to come by, much less lots of it.
- Access to onsite sizing and POC professionals can be a challenge, depending on the solution stack you wish to test.
- The overall expense can seem prohibitive, but as with an insurance policy, you will likely prefer to spend a little today to save a lot down the road.
Because of these factors,
in the end I’ve seen more user-based SAP solution sizing than anything
else. Of course, transaction-based sizing and, to a lesser extent,
proof-of-concept testing will always be popular when risk is
highest. This is especially true for large or otherwise complex mySAP
architectures, where transaction-based sizing should represent a minimum
requirement of sorts, so as to both accurately and conservatively size
your SAP project. And other sizing approaches exist, too, that target
specific unknowns. For example, “characterization testing” seeks to test
a specific function like batch processing or reporting, to learn how
much horsepower needs to be available to meet the required minimum
window of time to complete the batch processing or reporting. In those
cases where SAP allows a process to be broken down into parallel
processes and executed concurrently (called parallelization), such characterization is particularly important.
Sizing Tools, Practices, and Assumptions
SAP’s Quick
Sizer, along with all hardware vendors’ SAP sizing tools, must make
assumptions regarding what you seek in a solution. One of the most
important assumptions that you need to verify with your hardware vendor
involves how your specific SAP workload is distributed among the servers
in your solution. Each vendor’s SAP sizing tools make
assumptions like these:
- The load borne by a system architected for three tiers is often split 33/67 or 25/75 between the database server and the application tier (the Central Instance combined with all application servers), respectively. Verify that these numbers are consistent if you are working with multiple hardware vendors.
- Batch and online user loads may be distributed to dedicated servers, or shared. Verify how the tool addresses this, if at all.
- When clustering, each node in a cluster can be configured to perform work while running in “normal” non-failover mode. Verify what both the normal and failover workloads look like for each cluster node.
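The load-split assumption in the first bullet amounts to simple apportionment. The sketch below divides a total capacity target between the database server and the application tier; the total and the 33% database share are illustrative placeholders.

```python
# Sketch of the 33/67 load-split assumption: apportion a total capacity
# target (in SAPS) between the database server and the application tier
# (CI plus application servers). All figures are illustrative.
total_saps = 3000
db_share = 0.33          # the other common assumption would be 0.25

db_saps = total_saps * db_share
app_saps = total_saps - db_saps
print(f"database server ~{db_saps:.0f} SAPS, "
      f"application tier ~{app_saps:.0f} SAPS")
```

If one vendor sizes against a 25/75 split and another against 33/67, the proposed database servers will differ markedly even for an identical workload, which is exactly why the split must be stated explicitly.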
A sizing tool must also
make assumptions as to the specific version of a database release,
operating system version, and even mySAP release! My advice is to verify
with your hardware or software vendor that any specific SAP Solution
Stack components you require are indeed addressed by their toolsets and
approaches. It makes little sense, for example, for a hardware vendor to
use its SAP/UNIX sizer for a specifically requested Windows solution.
The same goes for specific versions of databases and mySAP
components—each version has different processing, memory, and often even
disk requirements. Using a tool incapable of addressing your specific
solution stack makes the output derived from that tool suspect.
Similarly,
attention needs to be given to the methodology employed to determine how
large your SAP database will be in two or three years, and what the
growth chart will look like over time. Different database sizing
approaches have evolved over the years; verify that your prospective
vendors are using the same method, or in some other way guide them
toward agreeing on a number that makes sense to everyone. I like to size
an SAP database for three years’ growth, for example.
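A three-year database projection is often modeled as compound growth. The sketch below applies an assumed annual growth rate to a hypothetical starting size; both figures are placeholders, and your vendors may use a different growth model entirely, which is precisely what you should verify.

```python
# Simple three-year database growth projection under an assumed
# compound annual growth rate. Both inputs are hypothetical.
initial_db_gb = 200
annual_growth = 0.40     # assumed 40% growth per year

sizes = [initial_db_gb * (1 + annual_growth) ** year for year in range(4)]
for year, size in enumerate(sizes):
    print(f"year {year}: ~{size:.0f} GB")
```

Note how quickly compounding outpaces a linear estimate: the same inputs treated linearly (adding 80 GB per year) would predict 440 GB at year three rather than roughly 549 GB, a gap large enough to change the proposed disk subsystem.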
Another
important assumption has to do with system utilization numbers. When a benchmark is run,
the system being tested is usually stressed up to the point where the
average response time observed by each user is just under two seconds.
Doing so typically pushes CPU utilization to a maximum of 99 or 100%. In
the real world, though, when sizing SAP solutions, hardware vendors
need to make assumptions as to what kind of utilization thresholds you
are comfortable with. These vary depending upon the vendor, but
typically resemble the following de facto standards:
- Servers are sized such that the average CPU utilization over time is 65% or so. In other words, the system might spike to 100% or sit nearly idle occasionally, but generally it will hover around the 65% mark.
- Of this 65% utilization, fully half is dedicated to user-based dialog processing, and the other half to a combination of batch processing, printing, interface processing, and reporting.
- The remaining 35% of capacity stays available so that new work can be initiated with minimal delay, which in turn yields predictably good response times. This extra capacity also helps absorb unforeseen or unplanned future workloads.
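The utilization target translates into a simple capacity calculation: divide the expected steady-state load by the target utilization to find the capacity you should actually buy. The workload figure below is hypothetical.

```python
# Back out the capacity needed so a given workload lands at a
# 65%-average-utilization target. The workload figure is hypothetical.
workload_saps = 1300          # expected steady-state demand
target_utilization = 0.65

required_capacity = workload_saps / target_utilization
headroom = required_capacity - workload_saps
print(f"size for ~{required_capacity:.0f} SAPS "
      f"(~{headroom:.0f} SAPS of headroom)")
```

A vendor that silently sizes against 100% utilization instead would quote roughly a third less capacity for the same workload, which is why this assumption deserves explicit scrutiny.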
Be careful that each of
these assumptions is clearly documented in the sizing documents you
receive from each hardware vendor. Differences in assumptions can make
an enormous difference in the solution proposed by one vendor over
another, for instance. I know of one hardware vendor in particular that
in the past deviated from these standards for the express purpose of
making its hardware solutions seem more robust than the competition’s:
by sizing for the full 100% capacity of a server, its solutions appeared
to require less RAM and CPU processing power. This helped it undercut
other hardware vendors’ proposals, when in fact these
less-than-customer-focused tactics only left its clients with premature
performance problems that eventually had to be addressed before Go-Live.
Best Practices Regarding System Landscape Design
To ensure
apples-to-apples sizings, I recommend that you plainly direct each
potential hardware vendor to size for identical system landscapes. In
other words, do not leave this up to their discretion (unless your goal
is to simply see at a high level what kind of unique solution each
vendor can craft to solve your business problem). For example, you may
want to be explicit about how each vendor should address high
availability. It is better to indicate “include clusters for HA and SQL
Server log shipping for DR” rather than only stating a 99.9% uptime
requirement and allowing each vendor to determine how to address this
themselves.
And be clear as to which
SAP system landscape components you want to see included in your sizing.
A four-system landscape can be interpreted in many different ways: one
vendor might make the fourth system a training system, another a
technical sandbox, and a third a staging system. The
same approach is true for database sizing—clearly indicate where you
wish to host copies of your full production size database, and where
smaller development or sandbox databases are appropriate.
With regard to the system
landscape, you also must be clear about whether a fourth tier is
required, and what exactly that entails. And you need to cover landscape
deployment options, like using instance stacking
to install multiple systems on one physical server. Stacking is quite
common in the Unix world of SAP (and to a much lesser extent, Windows),
where development, test, and training instances might all reside on one
very capable server rather than separate servers. Finally, you should
help push each vendor toward a consistent standard for sizing the
various systems within the system landscape. Specify, for example, where
minimal server, disk subsystem, and other hardware components should be
employed. Your Test system should be able to support a specific number
of users, or a specific percentage of the load to be eventually borne by
production. Similarly, your development system should be configured
robustly enough to keep your development team from walking off the
job—give your hardware vendors a specific target, like the ability to
support 20 high-activity developers, and this will help you to continue
to support an apples-to-apples sizing comparison.